Grammatical Inference and Computational Linguistics
Abstract
When dealing with language, (machine) learning can take many different forms, of which the most important are those concerned with learning languages and grammars from data. Questions in this context have been at the intersection of the fields of inductive inference and computational linguistics for the past fifty years. To go back to the pioneering work, Chomsky (1955; 1957) and Solomonoff (1960; 1964) were interested, for very different reasons, in systems or programs that could deduce a language when presented with information about it. A little later, Gold (1967; 1978) proposed a unifying paradigm called identification in the limit, and the term grammatical inference seems to have appeared in Horning’s PhD thesis (1969). Outside linguistics, researchers and engineers dealing with pattern recognition, under the impetus of Fu (1974; 1975), invented algorithms and studied subclasses of languages and grammars from the point of view of what could or could not be learned. Researchers in machine learning tackled related problems (the most famous being that of inferring a deterministic finite automaton, given examples and counter-examples of strings). Angluin (1978; 1980; 1981; 1982; 1987) introduced the important setting of active learning, or learning from queries, whereas Pitt and his colleagues (1988; 1989; 1993) gave several complexity-inspired results that exposed the hardness of the different learning problems. Researchers working in more applied areas, such as computational biology, also deal with strings. A number of researchers from that field worked on learning grammars or automata from string data (Brazma and Cerans, 1994; Brazma, 1997; Brazma et al., 1998).
Similarly, stemming from computational linguistics, one can point out the work relating language learning to more complex grammatical formalisms (Kanazawa, 1998), the more statistical approaches based on building language models (Goodman, 2001), or the different systems introduced to automatically build grammars from sentences (van Zaanen, 2000; Adriaans and Vervoort, 2002). Surveys of related work in specific fields can also be found (Natarajan, 1991; Kearns and Vazirani, 1994; Sakakibara, 1997; Adriaans and van Zaanen, 2004; de la Higuera, 2005; Wolf, 2006).
Similar resources
Learning Automata and Grammars
The problem of learning or inferring automata and grammars has been studied for decades and has connections to many disciplines, including bioinformatics, computational linguistics and pattern recognition. In this paper we present a short survey of basic models and techniques related to grammatical inference and try to outline some promising new approaches which we expect to bring new light...
On Structural Inference for XML Data
Semistructured data presents many challenges, mainly due to its lack of a strict schema. These challenges are further magnified when large amounts of data are gathered from heterogeneous sources. We address this by investigating and developing methods to automatically infer structural information from example data. Using XML as a reference format, we approach the schema generation problem b...
Linguistic Representation and Gricean Inference
An essential ingredient of language use is our ability to reason about utterances as intentional actions. Linguistic representations are the natural substrate for such reasoning, and models from computational semantics can often be seen as providing an infrastructure to carry out such inferences from rich and accurate grammatical descriptions. Exploring such inferences offers a productive pragm...
Current Trends in Grammatical Inference
Grammatical inference has historically found its first theoretical results in the field of inductive inference, but its first applications in that of Syntactic and Structural Pattern Recognition. In the mid-nineties, the field emancipated itself and researchers from a variety of communities moved in: Computational Linguistics, Natural Language Processing, Algorithmics, Speech Recognition, Bio-Inf...
Functional analysis of Subject and Verb in Theses Abstracts on Applied Linguistics
The purpose of the present study is to analyse abstracts related to Applied Linguistics, and more precisely the discourse functions of grammatical subjects and verbs. The corpus consisted of 50 PhD thesis abstracts written on the subject of Applied Linguistics. All of the abstracts were written from 2010 to 2014. The theses from which the abstracts were extracted are available in the ProQuest d...
Publication date: 2009